[OpenTelemetry] Improve HistogramExplicitBounds by martincostello · Pull Request #7165 · open-telemetry/opentelemetry-dotnet

martincostello · 2026-04-25T12:29:04Z

Fixes #3428

Changes

Apply recommendations from #3428:

Use SIMD vectorization to find buckets.
Use radix sort.
Apply cap to histogram explicit bounds (10M).

Written primarily by Copilot through an iterative approach.

Final benchmark results are shown below: #7165 (comment)

I haven't benchmarked the ARM64 SIMD branch as I don't have the hardware for that locally.

Merge requirement checklist

CONTRIBUTING guidelines followed (license requirements, nullable enabled, static analysis, etc.)
Unit tests added/updated
Appropriate CHANGELOG.md files updated for non-trivial changes
~~Changes in public API reviewed (if applicable)~~

Apply recommendations from open-telemetry#3428: - Use SIMD vectorization to find buckets. - Use radix sort. - Apply cap to histogram explicit bounds (10M). Written primarily by Copilot through an iterative approach.

codecov · 2026-04-25T12:34:59Z

Codecov Report

❌ Patch coverage is 95.76271% with 5 lines in your changes missing coverage. Please review.
✅ Project coverage is 89.75%. Comparing base (e4762a0) to head (6e87e47).

Files with missing lines	Patch %	Lines
...c/OpenTelemetry/Metrics/HistogramExplicitBounds.cs	95.28%	5 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #7165      +/-   ##
==========================================
- Coverage   89.82%   89.75%   -0.08%     
==========================================
  Files         276      276              
  Lines       14738    14822      +84     
==========================================
+ Hits        13238    13303      +65     
- Misses       1500     1519      +19

Flag	Coverage Δ
unittests-Project-Experimental	`89.63% <95.76%> (-0.10%)`	⬇️
unittests-Project-Stable	`89.66% <95.76%> (-0.07%)`	⬇️
unittests-Solution	`89.46% <95.76%> (-0.15%)`	⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines	Coverage Δ
src/OpenTelemetry/Metrics/MetricStreamIdentity.cs	`87.56% <100.00%> (+0.19%)`	⬆️
...trics/View/ExplicitBucketHistogramConfiguration.cs	`89.47% <100.00%> (+6.14%)`	⬆️
...c/OpenTelemetry/Metrics/HistogramExplicitBounds.cs	`95.68% <95.28%> (-4.32%)`	⬇️

... and 6 files with indirect coverage changes

Add missing patch coverage.

Fix CS8604 warning.

Add note for histogram boundary upper limit.

Copilot

Pull request overview

Improves histogram explicit-bound bucket lookup performance in the metrics pipeline, and adds a hard cap on the number of explicit boundaries to prevent pathological allocations/processing.

Changes:

Replaces the previous large-bound lookup strategy with a radix-partitioned search + SIMD-accelerated linear scan fallback.
Adds a 10,000,000 max explicit boundary count check for both view configuration and instrument advice.
Adds unit tests, fuzz/property tests, and benchmarks; updates changelog and solution/project wiring.

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 1 comment.

Show a summary per file

File	Description
test/OpenTelemetry.Tests/Metrics/MetricViewTests.cs	Adds unit test validating boundary-count limit exception details.
test/OpenTelemetry.Tests/Metrics/AggregatorTests.cs	Adds coverage for large-bound lookup edge cases (mixed signs, infinities, NaN, degenerate NaN bounds) and display-bounds cleanup.
test/OpenTelemetry.FuzzTests/OpenTelemetry.FuzzTests.csproj	New fuzz test project for OpenTelemetry core metrics.
test/OpenTelemetry.FuzzTests/Metrics/HistogramExplicitBoundsFuzzTests.cs	Adds FsCheck property tests comparing optimized path vs scalar baseline.
test/Benchmarks/Metrics/HistogramExplicitBoundsBenchmarks.cs	Adds BenchmarkDotNet benchmarks for bucket lookup across sizes/layouts.
src/OpenTelemetry/OpenTelemetry.csproj	Grants InternalsVisibleTo for the new OpenTelemetry.FuzzTests project.
src/OpenTelemetry/Metrics/View/ExplicitBucketHistogramConfiguration.cs	Introduces MaxBoundaryCount and enforces it when setting Boundaries.
src/OpenTelemetry/Metrics/MetricStreamIdentity.cs	Enforces MaxBoundaryCount for instrument advice-provided boundaries.
src/OpenTelemetry/Metrics/HistogramExplicitBounds.cs	Implements radix-partitioned + SIMD-assisted bucket index lookup and infinity cleanup.
src/OpenTelemetry/CHANGELOG.md	Documents the explicit-boundary cap as a breaking change.
OpenTelemetry.slnx	Adds the new fuzz test project to the solution.

Comments suppressed due to low confidence (1)

src/OpenTelemetry/Metrics/View/ExplicitBucketHistogramConfiguration.cs:74

IsSortedAndDistinct currently treats double.NaN as valid because all comparisons with NaN return false, so arrays containing NaN can pass validation even though they are not meaningfully “ascending order with distinct values”. Consider explicitly rejecting NaN (and possibly non-finite values) in this check so Boundaries validation matches the documented requirements and downstream bucketing assumptions.

        for (var i = 1; i < values.Length; i++)
        {
            if (values[i] <= values[i - 1])
            {
                return false;

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Fix incorrect ArgumentException parameter name.

Kielek · 2026-05-19T11:10:35Z

@cijothomas, could you please check this PR? Especially adding boundaries?

Kielek

Some direct feedback while reviewing by Codex, in general LGTM.

It will be great to see Cijo final review, but I will not block this.

@martincostello, I think that you need to solve problems with MemoryMarshal, your preious changes are working ;)

Remove `MemoryMarshal` usage after merging with main to remove unsafe code.

martincostello · 2026-06-09T10:56:05Z

Need to re-run the benchmarks with the latest changes.

martincostello · 2026-06-09T14:29:36Z

Benchmark comparison: `main` (baseline) vs `gh-3428` (candidate) (generated by Claude 🐙)

Benchmarks.Metrics.HistogramExplicitBoundsBenchmarks — BenchmarkDotNet v0.15.8, .NET 10.0.8 (X64 RyuJIT), 13th Gen Intel Core i7-13700H.

TL;DR

Allocations: unchanged. Every benchmark allocates 0 B on both branches — this change is allocation-neutral.
Special-case lookups (±Infinity, NaN) are dramatically faster. These now short-circuit instead of scanning the bounds, dropping from ~4–21 ns to well under 1 ns (often ~0).
General finite-value lookups are ~50% faster at larger bound counts. At BoundCount=49, ExactBoundary / InRangeValue / WithInfiniteBounds all roughly halve (~10.5 ns → ~5.1 ns).
At small bound counts (10) the general lookups are roughly neutral (within noise; one case ~5% slower).
⚠️ BoundCount=50 and BoundCount=1000 comparisons for the finite-value lookups are unreliable — several baseline cells report sub-nanosecond means (e.g. 0.0007 ns, 0.0308 ns), which indicate the JIT eliminated the operation as dead code. The candidate measures real work in those cells, so the apparent "regression" there is a measurement artifact, not a true slowdown. I've excluded those from the conclusions above.

Representative results (.NET 10.0, Mean)

Cases below are where both branches measure real work, so the comparison is sound:

Scenario (BoundCount / Layout)	Method	Baseline	Candidate	Δ
49 / MixedSigned	LookupExactBoundary	10.70 ns	5.14 ns	−52%
49 / MixedSigned	LookupInRangeValue	10.89 ns	5.03 ns	−54%
49 / MixedSigned	LookupWithInfiniteBounds	10.45 ns	5.12 ns	−51%
49 / MixedSigned	LookupPositiveInfinity	20.06 ns	0.54 ns	−97%
49 / MixedSigned	LookupNaN	15.67 ns	0.05 ns	−100%
49 / PositiveOnly	LookupExactBoundary	10.41 ns	4.98 ns	−52%
49 / PositiveOnly	LookupInRangeValue	10.34 ns	4.85 ns	−53%
49 / PositiveOnly	LookupPositiveInfinity	21.02 ns	0.46 ns	−98%
49 / PositiveOnly	LookupNaN	15.45 ns	~0 ns	−100%
10 / MixedSigned	LookupPositiveInfinity	7.84 ns	0.64 ns	−92%
10 / MixedSigned	LookupNaN	2.87 ns	0.02 ns	−99%
10 / MixedSigned	LookupExactBoundary	2.47 ns	2.18 ns	−12%
10 / MixedSigned	LookupWithInfiniteBounds	2.50 ns	2.26 ns	−10%
10 / PositiveOnly	LookupPositiveInfinity	4.48 ns	0.50 ns	−89%
10 / PositiveOnly	LookupNaN	3.04 ns	~0 ns	−100%
10 / PositiveOnly	LookupExactBoundary	2.48 ns	2.61 ns	+5% (noise)
10 / PositiveOnly	LookupInRangeValue	2.57 ns	2.41 ns	−6%

The ±Infinity improvement also holds for the 50 and 1000 bound counts (e.g. LookupNegativeInfinity at 1000/MixedSigned: 3.29 ns → 0.18 ns; LookupPositiveInfinity: 4.13 ns → 0.48 ns) — these baseline cells were not dead-code-eliminated, so they're trustworthy.

Conclusion

A clear performance win with no allocation cost: infinity/NaN edge-case lookups are now effectively free, and the common finite-value lookups are ~50% faster once the bound array is large enough to matter. No genuine regressions were found — the only "slower" numbers come from baseline cells the JIT had optimized away, where the candidate is simply being measured doing real work.

Note: the .NET Framework 4.6.2 rows show the same trend and the same dead-code-elimination artifacts at BoundCount 50/1000.

[OpenTelemetry] Improve HistogramExplicitBounds

bf42e2d

Apply recommendations from open-telemetry#3428: - Use SIMD vectorization to find buckets. - Use radix sort. - Apply cap to histogram explicit bounds (10M). Written primarily by Copilot through an iterative approach.

github-actions Bot added pkg:OpenTelemetry Issues related to OpenTelemetry NuGet package perf Performance related labels Apr 25, 2026

martincostello added 2 commits April 25, 2026 13:56

[OpenTelemetry] Extend test coverage

dd6356b

Add missing patch coverage.

[OpenTelemetry] Fix build

42df801

Fix CS8604 warning.

This comment was marked as outdated.

Sign in to view

[OpenTelemetry] Update CHANGELOG

ed7007c

Add note for histogram boundary upper limit.

martincostello marked this pull request as ready for review April 25, 2026 14:55

martincostello requested a review from a team as a code owner April 25, 2026 14:55

Copilot AI review requested due to automatic review settings April 25, 2026 14:55

Copilot started reviewing on behalf of martincostello April 25, 2026 14:55 View session

Copilot AI reviewed Apr 25, 2026

View reviewed changes

Comment thread src/OpenTelemetry/Metrics/MetricStreamIdentity.cs Outdated

martincostello commented Apr 25, 2026

View reviewed changes

Comment thread src/OpenTelemetry/Metrics/MetricStreamIdentity.cs Outdated

martincostello added 3 commits April 25, 2026 16:22

[OpenTelemetry] Address feedback

5a47c8b

Fix incorrect ArgumentException parameter name.

Merge branch 'main' into open-telemetrygh-3428

5475a99

Merge branch 'main' into open-telemetrygh-3428

8e75703

martincostello added the keep-open Prevents issues and pull requests being closed as stale label May 7, 2026

Merge branch 'main' into open-telemetrygh-3428

dcf2c50

Kielek requested a review from cijothomas May 19, 2026 11:10

Merge branch 'main' into open-telemetrygh-3428

1b4eb37

rajkumar-rangaraj approved these changes Jun 9, 2026

View reviewed changes

Fix NaN handling in histogram bounds

a4554f2

Kielek approved these changes Jun 9, 2026

View reviewed changes

martincostello added 3 commits June 9, 2026 09:57

Merge branch 'main' into open-telemetrygh-3428

57314f2

[OpenTelemetry] Remove MemoryMarshal usage

804da78

Remove `MemoryMarshal` usage after merging with main to remove unsafe code.

Merge branch 'main' into open-telemetrygh-3428

6e87e47

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[OpenTelemetry] Improve HistogramExplicitBounds#7165

[OpenTelemetry] Improve HistogramExplicitBounds#7165
martincostello merged 13 commits into
open-telemetry:mainfrom
martincostello:gh-3428

martincostello commented Apr 25, 2026 •

edited

Loading

Uh oh!

codecov Bot commented Apr 25, 2026 •

edited

Loading

Uh oh!

This comment was marked as outdated.

Copilot AI left a comment

Uh oh!

Uh oh!

Uh oh!

Kielek commented May 19, 2026

Uh oh!

Kielek left a comment •

edited

Loading

Uh oh!

martincostello commented Jun 9, 2026

Uh oh!

martincostello commented Jun 9, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Uh oh!

Conversation

martincostello commented Apr 25, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Changes

Merge requirement checklist

Uh oh!

codecov Bot commented Apr 25, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

This comment was marked as outdated.

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Uh oh!

Kielek commented May 19, 2026

Uh oh!

Kielek left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

martincostello commented Jun 9, 2026

Uh oh!

martincostello commented Jun 9, 2026

Benchmark comparison: main (baseline) vs gh-3428 (candidate) (generated by Claude 🐙)

TL;DR

Representative results (.NET 10.0, Mean)

Conclusion

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

martincostello commented Apr 25, 2026 •

edited

Loading

codecov Bot commented Apr 25, 2026 •

edited

Loading

Kielek left a comment •

edited

Loading

Benchmark comparison: `main` (baseline) vs `gh-3428` (candidate) (generated by Claude 🐙)